Metro Network Analysis involves exploring the network of metro systems to understand their structure, efficiency, and effectiveness. We will analyze routes, stations, traffic, connectivity, as well as other operational aspects. We will go through the task of Delhi Metro Network Analysis using Python.
Analyzing a city's metro network helps improve urban transportation infrastructure, supports better city planning, and enhances the commuter experience. The following is the process we can use for the task of Metro Network Analysis of Delhi:
So, for the analysis of Delhi Metro Network, we need to have a dataset based on all metro lines in Delhi and how they connect with each other. I found an ideal dataset for this task. You can download the dataset from here.
import pandas as pd
import folium
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import plotly.io as pio
pio.templates.default = "plotly_white"
metro_data = pd.read_csv("Delhi-Metro-Network.csv")
Now let's examine the dataset for any missing or null values, and then take a look at the data types.
# checking for missing values
missing_values = metro_data.isnull().sum()
print(missing_values)
Station ID                  0
Station Name                0
Distance from Start (km)    0
Line                        0
Opening Date                0
Station Layout              0
Latitude                    0
Longitude                   0
dtype: int64
# checking data types
data_types = metro_data.dtypes
print(data_types)
Station ID                    int64
Station Name                 object
Distance from Start (km)    float64
Line                         object
Opening Date                 object
Station Layout               object
Latitude                    float64
Longitude                   float64
dtype: object
# we can also use the info() method to get the same details
metro_data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 285 entries, 0 to 284
Data columns (total 8 columns):
 #   Column                    Non-Null Count  Dtype
---  ------                    --------------  -----
 0   Station ID                285 non-null    int64
 1   Station Name              285 non-null    object
 2   Distance from Start (km)  285 non-null    float64
 3   Line                      285 non-null    object
 4   Opening Date              285 non-null    object
 5   Station Layout            285 non-null    object
 6   Latitude                  285 non-null    float64
 7   Longitude                 285 non-null    float64
dtypes: float64(3), int64(1), object(4)
memory usage: 17.9+ KB
As seen, the Opening Date column is stored as object (plain strings), so we will convert it to datetime format.
# converting 'Opening Date' to datetime format
metro_data['Opening Date'] = pd.to_datetime(metro_data['Opening Date'])
metro_data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 285 entries, 0 to 284
Data columns (total 8 columns):
 #   Column                    Non-Null Count  Dtype
---  ------                    --------------  -----
 0   Station ID                285 non-null    int64
 1   Station Name              285 non-null    object
 2   Distance from Start (km)  285 non-null    float64
 3   Line                      285 non-null    object
 4   Opening Date              285 non-null    datetime64[ns]
 5   Station Layout            285 non-null    object
 6   Latitude                  285 non-null    float64
 7   Longitude                 285 non-null    float64
dtypes: datetime64[ns](1), float64(3), int64(1), object(3)
memory usage: 17.9+ KB
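The conversion above works because every date in this dataset parses cleanly. As a minimal sketch of what one might do if that were not the case, the snippet below uses a small made-up frame (not the real dataset) and passes errors="coerce" so that unparseable entries become NaT and can be counted instead of raising an error.

```python
import pandas as pd

# Hypothetical sample values for illustration only; the real column
# comes from Delhi-Metro-Network.csv
sample = pd.DataFrame({"Opening Date": ["2008-04-06", "2018-12-31", "not a date"]})

# errors="coerce" replaces entries that cannot be parsed with NaT
sample["Opening Date"] = pd.to_datetime(sample["Opening Date"], errors="coerce")

# counting how many entries failed to parse
unparsed = int(sample["Opening Date"].isna().sum())
print(sample.dtypes)
print("Unparsed dates:", unparsed)
```

Checking the NaT count after conversion is a cheap way to confirm that no dates were silently lost.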
First of all, visualizing the locations of the metro stations on a map will give a clear idea about what we are doing. It will give us an insight into the geographical distribution of the stations across Delhi. We will use the latitude and longitude data to plot each station.
For this, I’ll create a map with markers for each metro station. Each marker will represent a station, and we’ll be able to analyze aspects like station density and geographic spread. Let’s proceed with this visualization:
# defining a color scheme for the metro lines
line_colors = {
    'Red line': 'red',
    'Blue line': 'blue',
    'Yellow line': 'beige',
    'Green line': 'green',
    'Voilet line': 'purple',
    'Pink line': 'pink',
    'Magenta line': 'darkred',
    'Orange line': 'orange',
    'Rapid Metro': 'cadetblue',
    'Aqua line': 'black',
    'Green line branch': 'lightgreen',
    'Blue line branch': 'lightblue',
    'Gray line': 'lightgray'
}
delhi_map_with_line_tooltip = folium.Map(location=[28.7041, 77.1025], zoom_start=10)
# adding colored markers for each metro station with line name in tooltip
for index, row in metro_data.iterrows():
    line = row['Line']
    color = line_colors.get(line, 'black')  # default color is black if the line is not in the dictionary
    folium.Marker(
        location=[row['Latitude'], row['Longitude']],
        popup=f"{row['Station Name']}",
        tooltip=f"{row['Station Name']}, {line}",
        icon=folium.Icon(color=color)
    ).add_to(delhi_map_with_line_tooltip)
# Displaying the updated map
delhi_map_with_line_tooltip
The above map shows the geographical distribution of Delhi Metro stations. Each marker represents a metro station, colored by its line. This map provides a visual explanation of how the metro stations are spread across Delhi.
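The map above is centered on a hardcoded coordinate. As an alternative, the center and extent can be derived from the station coordinates themselves. The sketch below uses a few made-up coordinates (not the real dataset) to compute the mean position and a bounding box; the names stations, center, and bounds are illustrative.

```python
import pandas as pd

# Made-up coordinates for illustration; in the analysis these would be
# the Latitude and Longitude columns of metro_data
stations = pd.DataFrame({
    "Latitude": [28.60, 28.70, 28.65],
    "Longitude": [77.10, 77.20, 77.30],
})

# mean position of all stations, usable as a map center
center = [stations["Latitude"].mean(), stations["Longitude"].mean()]

# south-west and north-east corners of the area covered by the stations
bounds = [
    [stations["Latitude"].min(), stations["Longitude"].min()],
    [stations["Latitude"].max(), stations["Longitude"].max()],
]

print("Center:", center)
print("Bounds:", bounds)
```

The resulting center could be passed as the location argument of folium.Map in place of the fixed coordinate, and the bounding box gives a quick numeric sense of the geographic spread.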
Now, we will address the growth of the Delhi Metro network over time: how many stations were opened each year and then visualize this growth. It can provide insights into the pace of metro network expansion and its development phases.
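We will do the following: extract the opening year from each station's opening date, count the stations opened in each year, and plot the counts as a bar chart.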
Let’s proceed with this analysis:
metro_data['Opening Year'] = metro_data['Opening Date'].dt.year
# counting the number of stations opened each year
stations_per_year = metro_data['Opening Year'].value_counts().sort_index()
stations_per_year_df = stations_per_year.reset_index()
stations_per_year_df.columns = ['Year', 'Number of Stations']
fig = px.bar(stations_per_year_df, x='Year', y='Number of Stations',
title="Number of Metro Stations Opened Each Year in Delhi",
labels={'Year': 'Year', 'Number of Stations': 'Number of Stations Opened'})
fig.update_layout(xaxis_tickangle=-45, xaxis=dict(tickmode='linear'),
yaxis=dict(title='Number of Stations Opened'),
xaxis_title="Year")
fig.show()
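Yearly counts show the pace of expansion; a running total shows the size of the network at each point in time. The following is a minimal sketch of that idea on made-up opening years (not the real dataset), using value_counts followed by cumsum.

```python
import pandas as pd

# Made-up opening years for illustration; in the analysis this would be
# metro_data['Opening Year']
opening_years = pd.Series([2002, 2002, 2004, 2005, 2005, 2005], name="Opening Year")

# stations opened in each year, in chronological order
per_year = opening_years.value_counts().sort_index()

# running total of stations in operation by the end of each year
cumulative = per_year.cumsum()
print(cumulative)
```

Applied to the real data, the cumulative series could be plotted as a line chart alongside the yearly bars.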
The chart displays the number of stations opened per year. We can clearly notice that:
Now, we will analyze the metro lines in terms of the number of stations and the average distance between stations. This will shed light on the characteristics of each metro line, such as which line has the most stations.
To do so, we need to calculate the number of stations per line and the average distance between stations per line. The average is taken as the line's total length (the maximum distance from the start) divided by the number of gaps between stations, which is the station count minus one. The results will then be visualized for a better understanding.
Let's begin:
# counting the number of stations on each line
stations_per_line = metro_data['Line'].value_counts()
# calculating the total distance of each metro line (max distance from start)
total_distance_per_line = metro_data.groupby('Line')['Distance from Start (km)'].max()
# average gap = total length / number of gaps; the division aligns on line name
avg_distance_per_line = total_distance_per_line / (stations_per_line - 1)
# building the DataFrame from the two Series so that both columns are aligned by line name
line_analysis = pd.DataFrame({
    'Number of Stations': stations_per_line,
    'Average Distance Between Stations (km)': avg_distance_per_line
})
line_analysis.index.name = 'Line'
line_analysis = line_analysis.reset_index()
# sorting the DataFrame by the number of stations (ties broken by line name)
line_analysis = line_analysis.sort_values(by=['Number of Stations', 'Line'], ascending=[False, True])
line_analysis.reset_index(drop=True, inplace=True)
print(line_analysis)
                 Line  Number of Stations  Average Distance Between Stations (km)
0           Blue line                  49                                1.097917
1           Pink line                  38                                1.421622
2         Yellow line                  37                                1.269444
3         Voilet line                  34                                1.318182
4            Red line                  29                                1.167857
5        Magenta line                  25                                1.379167
6           Aqua line                  21                                1.355000
7          Green line                  21                                1.240000
8         Rapid Metro                  11                                1.000000
9    Blue line branch                   8                                1.157143
10        Orange line                   6                                4.160000
11          Gray line                   3                                1.950000
12  Green line branch                   3                                1.050000
# creating subplots
fig = make_subplots(rows=1, cols=2, subplot_titles=('Number of Stations Per Metro Line',
'Average Distance Between Stations Per Metro Line'),
horizontal_spacing=0.2)
# plot for Number of Stations per Line
fig.add_trace(
go.Bar(y=line_analysis['Line'], x=line_analysis['Number of Stations'],
orientation='h', name='Number of Stations', marker_color='crimson'),
row=1, col=1
)
# plot for Average Distance Between Stations
fig.add_trace(
go.Bar(y=line_analysis['Line'], x=line_analysis['Average Distance Between Stations (km)'],
orientation='h', name='Average Distance (km)', marker_color='navy'),
row=1, col=2
)
# update xaxis properties
fig.update_xaxes(title_text="Number of Stations", row=1, col=1)
fig.update_xaxes(title_text="Average Distance Between Stations (km)", row=1, col=2)
# update yaxis properties
fig.update_yaxes(title_text="Metro Line", row=1, col=1)
fig.update_yaxes(title_text="", row=1, col=2)
# update layout
fig.update_layout(height=600, width=1200, title_text="Metro Line Analysis", template="plotly_white")
fig.show()
The table presents a detailed analysis of the Delhi Metro lines, including the number of stations on each line and the average distance between stations.
The bar chart gives a clearer view of the table's details. It has two panels: one for the number of stations per line and another for the average distance between stations.
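The average distance used here is a shortcut: total line length divided by the number of gaps. As a cross-check, the same figure can be obtained by averaging the actual gaps between consecutive stations. The sketch below does both on a made-up two-line frame (not the real dataset); the two results agree whenever a line's first station sits at distance 0.

```python
import pandas as pd

# Made-up distances for illustration; column names follow the dataset
toy = pd.DataFrame({
    "Line": ["A", "A", "A", "B", "B"],
    "Distance from Start (km)": [0.0, 1.0, 3.0, 0.0, 2.5],
})

# ordering stations along each line before taking differences
toy = toy.sort_values(["Line", "Distance from Start (km)"])

# gap to the previous station on the same line (NaN for the first station)
gaps = toy.groupby("Line")["Distance from Start (km)"].diff()
avg_gap = gaps.groupby(toy["Line"]).mean()

# shortcut: total length divided by (number of stations - 1)
counts = toy["Line"].value_counts()
shortcut = toy.groupby("Line")["Distance from Start (km)"].max() / (counts - 1)

print(avg_gap)
print(shortcut)
```

The gap-based version has the added benefit of exposing the individual distances, so unusually long or short hops on a line can be inspected directly.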
Now, it is time to explore the layouts of the stations (Elevated, Ground Level, Underground). We want to see if there are any patterns in these layouts. To do so, we will calculate the frequency of each layout type and visualize the counts to get a better understanding of their distribution.
Let's begin:
layout_counts = metro_data['Station Layout'].value_counts()
# creating the bar plot using Plotly
fig = px.bar(x=layout_counts.index, y=layout_counts.values,
labels={'x': 'Station Layout', 'y': 'Number of Stations'},
title='Distribution of Delhi Metro Station Layouts',
color=layout_counts.index,
color_discrete_sequence=px.colors.qualitative.Pastel)  # categorical palette, since the layout labels are discrete
# updating layout for better presentation
fig.update_layout(xaxis_title="Station Layout",
yaxis_title="Number of Stations",
coloraxis_showscale=False,
template="plotly_white")
fig.show()
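The overall counts do not show whether particular lines favor particular layouts. One way to look for such patterns is a cross-tabulation of line against layout. The sketch below is only an illustration on made-up rows (not the real dataset), using pd.crosstab.

```python
import pandas as pd

# Made-up rows for illustration; column names follow the dataset
toy = pd.DataFrame({
    "Line": ["Red line", "Red line", "Yellow line", "Yellow line", "Yellow line"],
    "Station Layout": ["Elevated", "Elevated", "Underground", "Underground", "Elevated"],
})

# rows are lines, columns are layout types, cells are station counts
layout_by_line = pd.crosstab(toy["Line"], toy["Station Layout"])
print(layout_by_line)
```

On the real data, the same call with metro_data['Line'] and metro_data['Station Layout'] would give one row per metro line.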
The bar chart and the counts show the distribution of different station layouts in the Delhi Metro network. We can conclude that:
We have gone through the process of analyzing the Delhi Metro network. We tried to understand its structure, efficiency, and effectiveness by analyzing routes, stations, and distances, among other operational aspects.